Abstract: Seeders (peers that do not request anything but contribute to the
system) are a powerful concept in peer-to-peer (P2P) systems. They make it
possible to leverage the capacities of a P2P system. While seeding is a natural
idea for filesharing or video-on-demand applications, it seems somewhat
counter-intuitive in the context of live streaming. This paper describes the
feasibility and performance of P2P live seeding.

After a formal definition of "live seeding" and efficiency, we consider the 
theoretical performance of systems where the overhead is neglected. We then 
propose a linear overhead model and extend the results for this model, for a 
single seeder and for a set of seeders as well (it is not always possible to perfectly 
aggregate individual efficiencies in a given system). 
Using auxiliary peers
for live peer-to-peer streaming

Summary: Seeders (peers that already hold a given content and contribute
to its dissemination) are a key concept of peer-to-peer (P2P). Among other
things, they make it possible to increase the performance of a P2P system.
But while having seeders is natural in the context of file sharing or
video-on-demand, it seems incompatible with live streaming.
The goal of this report is to show that, to some extent, it is
feasible.

After formally defining the concept of a seeder for live streaming, and
proposing a definition of efficiency, we study the theoretical performance
of seeders in systems where the control cost is neglected.
We then propose a model with an affine control cost, and give the
results for a single seeder as well as for a set of seeders (a
set of seeders does not necessarily perform as well as the sum
of its elements).

Keywords: peer-to-peer, live streaming, bandwidth, seeders, efficiency
1 Introduction

Upload bandwidth is one of the main bottlenecks in peer-to-peer (P2P) content
distribution, which relies on the upload capacity of its participants to achieve
its purpose. The upload resource is all the more critical since most of today's
high-speed Internet accesses are asymmetric DSL connections that are not
designed to handle P2P traffic and offer relatively low upload capacity, with typical
uplink/downlink ratios between 1/4 and 1/20. On the one hand, the
democratization of very high-speed, symmetric Internet access like FTTH is expected
to improve the upload capacity of P2P systems. On the other hand, the
evolution of content quality standards keeps raising the requirements in terms of
content size and rate: early video feeds on the Internet were low
quality, requiring streamrates of a few hundred kbps, whereas HDTV implies
rates of up to 20 Mbps, possibly more with the arrival of 3D video content.
It is therefore likely that upload will still be a major bottleneck of
tomorrow's P2P content distribution.

1.1 Motivation 

In order to increase the available resources, a standard P2P technique is to
leverage the capacity of the system by using seeders, i.e. peers that contribute
to the system but do not (currently) need anything. Using seeders is quite
natural for file-sharing or Video-on-Demand: after a peer has downloaded its
file or video, it becomes a potential seeder for that content. However, seeding is
counter-intuitive for live streaming systems: "live" content is created on the fly, so
it cannot be pro-actively possessed by peers. Therefore, for a peer to act as a
seeder, it has to receive at least a part of the corresponding content, which it
does not want to watch by definition.

1.2 Scope and contribution 

The goal of this paper is to describe the feasibility and performance one can ex- 
pect from P2P live seeding from a bandwidth budget perspective. This generic 
theoretical framework can be used to derive simple dimensioning rules and rec- 
ommendations for the design of P2P live streaming with seeders. 

In detail, we analyze the seeders' efficiency, which is the useful throughput
(goodput) they add to the system, compared to their upload capacity. We
provide explicit, tight upper bounds for efficiency, taking the overhead explicitly
into account. We also address the aggregation issues that come from using
several seeders. We give conditions and simple diffusion schemes that make it
possible to nearly achieve the theoretical bounds, and provide a few simple
examples that illustrate the potential of our findings.

Remark: focusing on a single scenario (live streaming) and a single type of
peer (seeders) was a deliberate choice, in order to get a clean framework for
investigating theoretical performance, especially with regard to the overhead
modeling aspects. This does not preclude possible extensions of the approach
presented here to other use cases.
1.3 Roadmap

The next section introduces the models that we use to derive our results. The
related work with respect to P2P bandwidth dimensioning is briefly presented in
Section 3. In Section 4, a formal definition of seeders' efficiency is proposed.
Section 5 proposes a preliminary study of efficiency for two overhead-free models.
This study is a starting point for the main results of this paper, which derive the
efficiency of seeders in a model with explicit overhead (Section 6). The validity
conditions and applications of the results are discussed in Section 7. Section 8
concludes.

2 Model 

We consider a live content that needs to be streamed to a set of users at a
constant rate $r$. The delivery is handled by a P2P live streaming system. The
specificity of live streaming is that the content cannot be prefetched. A play-out
buffer may tolerate some jitter, but the live constraints usually limit the size of
that buffer to less than a few seconds, so a conservative, yet realistic assumption
is that content must be received at exactly the rate $r$ during the whole watching
experience. By comparison, filesharing usually requires no minimal rate, while
in the case of Video-on-Demand, content may be prefetched at a rate greater
than $r$.

2.1 C/S/L systems 

We classify the nodes of the system into three categories: 

• Central servers are in charge of injecting initial copies of the stream into
the system. We assume they have a cumulated bandwidth capacity that
makes it possible to inject $N_C$ copies of the stream, with $N_C > 1$.

• Leechers are peers that want to watch the live content.

• Seeders (see footnote 1) are peers that do not want to watch the live content, but can
provide bandwidth to the system.

Remark: we do not focus on how seeding could be enforced in a real live
streaming system. However, most of the ideas from P2P filesharing or VoD
systems should apply to P2P live streaming. For instance:

• Some peers may remain connected to the system even when idle. 

• In a multi-channel system, leechers from an overprovisioned channel may 
act as seeders for another channel that lacks resources. 

• A share-ratio policy can encourage the peers to seed: peers that do not 
offer enough instant bandwidth may have to act as seeders for a while in 
order to "pay" their bandwidth debt. That kind of policy can be enforced 
through penalties (no service guarantee, reduced catalog) and rewards 
(higher QoS, access to premium content). 

• In the case of networks managed by some ISP or content provider,
managed seeders may be deployed by the provider to enhance the system
performance.

Footnote 1: The terms leecher and seeder come from the BitTorrent vocabulary [3].

We denote by $C$, $L$ and $S$ the sets of servers, leechers and seeders respectively.
The number of leechers (resp. seeders) is denoted by $N_L$ (resp. $N_S$). Every peer
$p$ in $L$ or $S$ has an upload capacity $u_p > 0$ devoted to the service. We assume
that the download capacity is always sufficient to support the content rate $r$ and
a possible overhead. $U_X$ and $u_X$ are respectively the total and average upload
bandwidths of set $X$ ($u_X = \frac{U_X}{N_X}$).

Note that the bandwidth distribution of the seeders may differ from that
of the leechers. For instance, if seeders are former leechers forced to remain
because of some share-ratio policy, low bandwidth peers will have to seed longer,
so the average seeders' bandwidth will be lower than the leechers'. On
the other hand, seeders deployed by some content provider should probably have
higher bandwidths.

A diffusion scheme for the system is a policy that describes how the content
is distributed. We assume here static diffusion schemes: between any two peers
(or servers) $p$ and $q$, the scheme gives a stream of goodput $0 \le r_{p,q} \le r$ that is
sent from $p$ to $q$. If $0 < r_{p,q} < r$, $r_{p,q}$ is called a substream. For convenience,
we consider that the substreams received by a given peer are non-overlapping,
so a peer $p$ receives an input of rate

$$i_p = \sum_{q \in \{L,S,C\}} r_{q,p}. \tag{1}$$

Remark: overlapping substreams can always be seen as non-overlapping ones:
if $r_{p,q}$ and $r_{s,q}$ are overlapping, with redundant data of rate $r_{p \cap s,q}$, we just have
to consider $\tilde{r}_{p,q} := r_{p,q} - r_{p \cap s,q}$ and see a rate $r_{p \cap s,q}$ from $p$ to $q$ as overhead.
Of course, choosing which redundant data is treated as overhead is arbitrary.

Servers apart, a node cannot send something it does not possess, so a diffusion
scheme verifies the condition

$$\forall p, q \in \{L,S\},\ r_{p,q} \le i_p. \tag{2}$$

A scheme is a solution of the live diffusion if it ensures that all leechers can
view the content, i.e.

$$\forall p \in \{L\},\ i_p = r. \tag{3}$$
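The conditions above can be checked mechanically on a small scheme. The following Python sketch is only an illustration: the dictionary representation of the substreams and the names `input_rates` and `is_live_solution` are assumptions made for this example, not part of the model. It computes the input rates as in Equation (1), then tests conditions (2) and (3).

```python
from collections import defaultdict

def input_rates(substreams):
    """Equation (1): i_p is the sum of the goodputs r_{q,p} received by p.

    `substreams` maps a (sender, receiver) pair to a goodput (hypothetical layout).
    """
    rates = defaultdict(float)
    for (sender, receiver), rate in substreams.items():
        rates[receiver] += rate
    return rates

def is_live_solution(substreams, servers, leechers, r, tol=1e-9):
    """Check conditions (2) and (3) for a static diffusion scheme."""
    i = input_rates(substreams)
    for (sender, receiver), rate in substreams.items():
        # Condition (2): apart from servers, a node cannot send more than it receives.
        if sender not in servers and rate > i[sender] + tol:
            return False
    # Condition (3): every leecher receives exactly the streamrate r.
    return all(abs(i[p] - r) <= tol for p in leechers)

# Toy scheme with r = 1: the server feeds half of the stream to seeder "s",
# which relays it to both leechers; the server sends the other half directly.
scheme = {
    ("c", "s"): 0.5,
    ("c", "l1"): 0.5, ("c", "l2"): 0.5,
    ("s", "l1"): 0.5, ("s", "l2"): 0.5,
}
print(is_live_solution(scheme, servers={"c"}, leechers={"l1", "l2"}, r=1.0))
```

The sketch does not verify that the received substreams are non-overlapping; as in the text, this is taken as a convention on how the rates are counted.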

2.2 Connectivity 

In this work, we use an explicit linear overhead to account for connectivity 
constraints. We also propose two simpler models that will serve for didactic 
purposes: perfect systems and limited fanout systems. 

2.2.1 Perfect systems 

In perfect systems, peers can arbitrarily use the upload capacity devoted to the 
service at no cost [5]. In particular, a perfect system possesses the following 
properties: 
• No overhead: all the bandwidth capacity can be used for effective data
transfer (goodput);

• Unlimited fanout: one peer can send data to an arbitrary number of other
peers simultaneously;

• Stream continuity: the live stream can be divided into arbitrarily small
substreams of constant rate.

2.2.2 Limited fanout 

As we will see in Section 5, optimizing perfect systems often leads to full mesh
solutions, which are not very practical. A first idea to make the model more
realistic, without explicitly considering the overhead, is to assume that the number
of non-null substreams $r_{p,q}$ is limited: each peer $p$ has a limit $c_p$ on the number of
outgoing connections it can sustain. This limited fanout implicitly acknowledges
the fact that managing a connection has a cost. Perfect systems correspond to
the extreme case $c_p = \infty$ for all $p \in \{L, S\}$.

2.2.3 Explicit linear overhead 

In order to get a more realistic and flexible model of real systems, we propose to
assume that the overhead is linear: the actual bandwidth used for sending some
content at rate $e$ from one peer to another is $(1 + a)e + b$, for some constants
$a, b \ge 0$. $a$ is the proportional cost and $b$ the additive cost. For simplicity, we
consider that the overhead cost is supported by the sender only (this assumption
will be discussed in Section 6.2.5).

The motivation for this model is that most existing sources of overhead are, 
at least in a rough approximation, proportional or additive: 

• Periodic signaling messages (keep-alive, overlay maintenance) are additive; 

• In chunk-based systems, the stream is split into atomic units of data (the 
chunks) that are distributed independently. For a constant chunk size, 
the signaling for sending one chunk is expected to induce a proportional 
overhead; 

• The cost for initiating a connection, averaged over the lifetime of that 
connection, can be considered as additive; 

• Some randomized diffusion schemes can have a non-null probability of
sending useless data, because it is outdated or redundant [2]. This can be
considered as proportional overhead.

Under the linear overhead model, a peer of bandwidth $u$ maintaining $c$
outgoing connections has a useful output limited to $\frac{u - bc}{1+a}$. For $b > 0$, $\lfloor \frac{u}{b} \rfloor$ is the
maximal fanout sustainable by that peer. For $b = 0$, the model is indeed equivalent
to perfect systems, except that all bandwidth capacities have to be normalized
by $\frac{1}{1+a}$.
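As a quick illustration of the linear overhead model, the short Python sketch below evaluates the cost of one connection, the useful output left to a peer and its maximal fanout. The function names are hypothetical and chosen for this example only; the numbers in the usage lines are arbitrary.

```python
import math

def bandwidth_used(e, a, b):
    """Bandwidth consumed by one connection carrying goodput e: (1 + a) e + b."""
    return (1 + a) * e + b

def useful_output(u, c, a, b):
    """Total goodput a peer of bandwidth u can emit over c connections: (u - b c) / (1 + a)."""
    return max(0.0, (u - b * c) / (1 + a))

def max_fanout(u, b):
    """Largest number of connections whose additive cost fits in u (unbounded if b = 0)."""
    return math.inf if b == 0 else math.floor(u / b)

# Example: a = 10%, b = 25 KBytes/s, peer with u = 270 KBytes/s.
print(bandwidth_used(100, 0.1, 25))    # cost of sending 100 KBytes/s of goodput
print(useful_output(270, 2, 0.1, 25))  # goodput available over 2 connections
print(max_fanout(270, 25))             # maximal sustainable fanout
```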

The notation used is summarized in Table 1.

Table 1: Table of notation

Symbol                  Meaning
$r$                     Streamrate of the content (constant)
$u_p$                   Available upload bandwidth of peer $p$
$U_X$ / $u_X$           Total/average upload capacity of population $X$
$N_X$                   Number of nodes in $X$
$N_C$                   Normalized capacity of servers ($U_C = N_C R$)
$i_p$                   Input rate of node $p$
$r_{p,q}$               Substream from $p$ to $q$
$\eta_d(X)$             Efficiency of set $X$ in diffusion scheme $d$
$c_p$                   Fanout of peer $p$
$a$                     Proportional cost of a connection
$b$                     Constant cost of a connection
$R := (1 + a)r + b$     Bandwidth consumed by goodput $r$


3 Related work 

Understanding bandwidth dimensioning is a crucial question in P2P systems,
as upload bandwidth is a scarce resource. The bandwidth conservation law [1]
states that, if all available bandwidth resources can be used for useful content
transfer, then the condition for a live streaming system to admit a solution is

$$\alpha_L + \beta \alpha_S + \frac{N_C}{N_L} \ge 1, \quad \text{with } \alpha_X = \frac{U_X}{N_X r} \text{ and } \beta = \frac{N_S}{N_L}. \tag{4}$$

In reality, not all bandwidth can be used all the time. Besides the issue of
overhead, other phenomena can prevent the system from using all available
bandwidth. For instance, a peer may have nothing to give at a given time, or
some bandwidth may be required for purposes other than feeding the leechers.
This motivates the concept of efficiency. Taking efficiency into account, Equation
(4) becomes

$$\eta(L)\alpha_L + \eta(S)\beta\alpha_S + \eta(C)\frac{N_C}{N_L} \ge 1, \tag{5}$$

where $\eta(X)$ is the efficiency of set $X$.
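A dimensioning check based on this condition can be sketched as follows. The Python snippet works with the un-normalized form of the inequality (both sides multiplied by $N_L r$), which avoids relying on the exact normalization of the coefficients; the function name and the numerical values are assumptions made for the example.

```python
def admits_live_solution(U_L, U_S, N_C, N_L, r, eta_L=1.0, eta_S=1.0, eta_C=1.0):
    """Bandwidth budget with efficiencies, in un-normalized form:
    eta(L) U_L + eta(S) U_S + eta(C) N_C r >= N_L r.
    """
    supply = eta_L * U_L + eta_S * U_S + eta_C * N_C * r
    return supply >= N_L * r

# 100 leechers with an average upload of 80 KBytes/s, r = 100 KBytes/s,
# servers able to inject a single copy of the stream.
print(admits_live_solution(U_L=8000, U_S=0, N_C=1, N_L=100, r=100))
# Same system helped by 30 seeders of 100 KBytes/s working at efficiency 0.7.
print(admits_live_solution(U_L=8000, U_S=3000, N_C=1, N_L=100, r=100, eta_S=0.7))
```

In this toy setting the leechers alone fall short of the demand, and the seeders close the gap even though 30% of their upload is lost.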

Efficiency was introduced by Qiu and Srikant [6] for BitTorrent-like
filesharing systems [3]. Its role was to quantify the fact that leechers cannot always
upload at full bandwidth capacity, as they may lack the content required by
others.

In the case of standard peer-assisted live streaming, with no seeders ($S = \emptyset$),
Liu et al. have shown that one can reach $\eta(L) = \eta(C) = 1$ for perfect and limited
fanout systems. In other words, a perfect use of the available bandwidth can be
achieved [5].

4 Defining seeders' efficiencies 

We propose to extend the concept of efficiency to seeders as follows: in a given
diffusion scheme $d$, the efficiency $\eta_d(s)$ of a seeder $s$ is the ratio between the
data bandwidth it adds to the system and its upload bandwidth $u_s$. In the
bandwidth budget, we need to acknowledge that the input rate $i_s$ received by
$s$ is "wasted": the rate $i_s$ could have been directly sent to some leechers, but
instead it is sent to peer $s$, which does not want to watch the content. We say
that $s$ "removes" $i_s$ from the pool of useful resources, in a manner of speaking (see footnote 2).
So if in $d$, $s$ transmits at rates $r_{s,p_1}, \ldots, r_{s,p_c}$ to $c$ other peers (Figure 1), its
efficiency is

$$\eta_d(s) := \frac{\sum_{k=1}^{c} r_{s,p_k} - i_s}{u_s}. \tag{6}$$

Figure 1: Principle of live seeding

The efficiency of a set $X \subset S$ is defined the same way: we consider the
difference between what comes out of $X$ and what enters, relative to the capacity:

$$\eta_d(X) := \frac{\sum_{s \in X,\, q \in \{L, S \setminus X\}} r_{s,q} - \sum_{p \in \{C, L, S \setminus X\},\, s \in X} r_{p,s}}{U_X}. \tag{7}$$

If we add and subtract the term $\sum_{s,t \in X} r_{s,t}$ in the numerator of (7), we
obtain a more compact expression for $\eta_d$:

$$\eta_d(X) = \frac{\sum_{s \in X} u_s \eta_d(s)}{U_X}. \tag{8}$$

Equation (8) tells that $\eta_d(X)$ is also the weighted average of the seeders'
individual efficiencies.
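The two definitions translate directly into code. The sketch below is a minimal Python illustration, with hypothetical function names and made-up rates: it computes the efficiency of a single seeder from its output rates, input rate and upload bandwidth, and the efficiency of a set as the bandwidth-weighted average of individual efficiencies.

```python
def seeder_efficiency(outputs, input_rate, upload):
    """Efficiency of one seeder: (sum of output rates - input rate) / upload bandwidth."""
    return (sum(outputs) - input_rate) / upload

def set_efficiency(seeders):
    """Efficiency of a set as the upload-weighted average of individual efficiencies.

    `seeders` is a list of (upload bandwidth, individual efficiency) pairs.
    """
    total_upload = sum(u for u, _ in seeders)
    return sum(u * eta for u, eta in seeders) / total_upload

# A seeder with u = 300 that receives a substream of rate 100 and relays it to 3 peers.
eta_s = seeder_efficiency([100, 100, 100], input_rate=100, upload=300)
print(eta_s)
# Adding an idle seeder of bandwidth 100 lowers the efficiency of the set.
print(set_efficiency([(300, eta_s), (100, 0.0)]))
```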



Footnote 2: In fact, deciding which peer is responsible for the "waste" of $i_s$ is arbitrary, and one
could decide to subtract $i_s$ from the bandwidth of the senders. However, making the seeders
responsible for their own input rates makes the analysis simpler.
4.1 Optimal efficiency

The optimal efficiency $\eta_{opt}(s)$ of a seeder $s$ in a given system is defined as the
supremum of the efficiencies it can get over all possible diffusion schemes:

$$\eta_{opt}(s) = \sup_d(\eta_d(s)). \tag{9}$$

$\eta_{opt}(s)$ is an upper bound for the proportion of the upload bandwidth that
can be useful for that system.

The same definition stands for the optimal efficiency of any subset $X \subset S$:

$$\eta_{opt}(X) = \sup_d(\eta_d(X)). \tag{10}$$

However, there is no guarantee that the individual optimal efficiencies of seeders
can be aggregated, because they may correspond to distinct schemes (a
counter-example is given in Section 5). As a consequence, Equation (8) becomes an
inequality when considering optimal efficiency:

$$\eta_{opt}(X) \le \frac{\sum_{s \in X} u_s \eta_{opt}(s)}{U_X}. \tag{11}$$

For convenience, subscripts may be omitted when there is no ambiguity.
We may also use metonymic notation in order not to clutter notation: $\eta(y)$ may
denote the efficiency of a seeder characterised by some property $y$ (like the input
rate, upload bandwidth, fanout, ...).



5 Perfect and limited fanout systems 

In this section, we derive the optimal efficiency of seeders when there is no 
explicit overhead. 

5.1 Perfect systems 

The optimal performance of seeders in a perfect system is given by the following 
theorem: 

Theorem 1. The optimal efficiency of a subset $X \subset S$ of seeders is

$$\eta(X) = \left(1 - \frac{1}{N_L}\right) \min\left(1, \frac{N_L r}{U_X}\right). \tag{12}$$

Proof. First we give a scheme that achieves the efficiency given by (12). The
scheme is the following: each seeder $s \in X$ receives from the servers a
distinct substream of rate $\frac{u_s}{N_L}$ (if $U_X \le N_L r$) or $\frac{u_s}{U_X} r$ (otherwise), and broadcasts
that substream to the $N_L$ leechers. Under that scheme, the input received
by $X$ from nodes outside $X$ is $\min(\frac{U_X}{N_L}, r)$, and the output given to leechers is
$\min(U_X, N_L r)$. Subtracting the input from the output and dividing by $U_X$ gives
the efficiency $\eta(X)$ from (12).

Then, we need to prove that $\eta(X)$ cannot be greater than $(1 - \frac{1}{N_L}) \min(1, \frac{N_L r}{U_X})$.
If $I_X$ is the input received by $X$ in a given scheme, the corresponding useful
output cannot be more than $\min(U_X, \min(I_X, r) N_L)$ because:

• $U_X$ is the capacity of $X$;

• $\min(I_X, r)$ is the maximal rate of information that $X$ can get. The best it
can achieve is to send that rate to the $N_L$ leechers: sending it to more
peers, for instance seeders outside $X$, would be ineffective because all
leechers already get the information received by $X$.

Given the input and output rates, and according to Equation (7), the efficiency
of $X$ for a given input $I_X$ is bounded by

$$\min\left(1, I_X \frac{N_L}{U_X}, r \frac{N_L}{U_X}\right) - \frac{I_X}{U_X}.$$

We deduce that the optimal efficiency is bounded by

$$\sup_{I_X \ge 0} \left( \min\left(1, I_X \frac{N_L}{U_X}, r \frac{N_L}{U_X}\right) - \frac{I_X}{U_X} \right).$$

If $U_X \le N_L r$, we get a maximal efficiency $1 - \frac{1}{N_L}$ for $I_X = \frac{U_X}{N_L}$, and if $U_X >
N_L r$, we get $\frac{r(N_L - 1)}{U_X}$ for $I_X = r$. Therefore the efficiency is never more than
$(1 - \frac{1}{N_L}) \min(1, \frac{N_L r}{U_X})$. This concludes the proof. □



Note that the condition $U_X > N_L r$ corresponds to an overprovisioned
system, as the seeders from $X$ have more bandwidth than required to feed the
stream $r$ to all leechers by themselves. In the definition of efficiency we
proposed, it is normalized by the dedicated upload bandwidth, so overprovisioned
systems naturally have lower efficiencies. On the other hand, for any
non-overprovisioned system, Equation (12) simplifies to

$$\eta(X) = 1 - \frac{1}{N_L}. \tag{13}$$

In other words, seeders are asymptotically optimal in a perfect P2P live
streaming system. The explanation is that the only bandwidth waste boils
down to at most one streamrate redirected to them for replication.
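The closed form of Theorem 1 can be cross-checked against the broadcast scheme used in its proof. The Python sketch below assumes the formula $(1 - 1/N_L)\min(1, N_L r / U_X)$ and the input/output rates described in the proof; function names and numerical values are illustrative only.

```python
def perfect_efficiency(U_X, N_L, r):
    """Closed form of Theorem 1: (1 - 1/N_L) * min(1, N_L r / U_X)."""
    return (1 - 1 / N_L) * min(1.0, N_L * r / U_X)

def broadcast_scheme_efficiency(U_X, N_L, r):
    """Efficiency of the scheme of the proof: distinct substreams broadcast to all leechers."""
    received = min(U_X / N_L, r)   # input of X from outside X
    sent = min(U_X, N_L * r)       # output of X towards the leechers
    return (sent - received) / U_X

# 100 leechers, r = 100 KBytes/s: a non-overprovisioned and an overprovisioned set.
for U_X in (5000, 20000):
    print(U_X, perfect_efficiency(U_X, 100, 100), broadcast_scheme_efficiency(U_X, 100, 100))
```

Both functions agree, and the second case shows how overprovisioning mechanically lowers the efficiency since it is normalized by the upload bandwidth.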

5.2 Limited fanout 

Each seeder $s$ now has a limited fanout $c_s$. Without loss of generality, we assume
that $\forall s \in S, c_s \le N_L$.

Theorem 2. The optimal efficiency of a single seeder $s \in S$ with limited
connections $c_s$ is

$$\eta(s) = \left(1 - \frac{1}{c_s}\right) \min\left(1, \frac{r c_s}{u_s}\right). \tag{14}$$

In particular, if $r c_s \ge u_s$ (the fanout is high enough to use all the
upload of $s$), we just have

$$\eta(s) = 1 - \frac{1}{c_s}. \tag{15}$$

Proof. As $s$ cannot reach more than $c_s$ peers, we just consider a sub-system
made of $C$, $s$ and $c_s$ leechers, and we conclude by applying Theorem 1 with $c_s$
instead of $N_L$. □
The bad news is that this result holds for a single seeder, and is not easy
to extend to a set of seeders. Equation (11) can be a strict inequality, meaning
that efficiency is lost in the process of making multiple seeders work together.
Consider for instance a toy system made of $N_L = 3$ leechers and two seeders $s_1$
and $s_2$ with parameters $u_1 = \frac{3}{2} r$, $c_1 = 2$, $u_2 = r$, $c_2 = 3$. Using Equation (14),
we get $\eta_{opt}(s_1) = \frac{1}{2}$ and $\eta_{opt}(s_2) = \frac{2}{3}$, so

$$\frac{\eta_{opt}(s_1) u_1 + \eta_{opt}(s_2) u_2}{u_1 + u_2} = \frac{17}{30}.$$

But if we try to find a scheme that maximizes the efficiency of $\{s_1, s_2\}$, the
best solution leads to

$$\eta_{opt}(\{s_1, s_2\}) = \frac{8}{15} < \frac{17}{30}.$$
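The arithmetic of this toy example can be replayed with exact fractions. The Python sketch below assumes the limited fanout formula $\eta(s) = (1 - 1/c_s)\min(1, r c_s / u_s)$ and the parameters $u_1 = \frac{3}{2} r$, $c_1 = 2$, $u_2 = r$, $c_2 = 3$ (with $r$ normalized to 1), which are the values consistent with the stated average of 17/30. The function name is hypothetical. Only the weighted average of individual optima is computed; the value 8/15 of the joint optimum is taken from the text.

```python
from fractions import Fraction

def limited_fanout_efficiency(u, c, r):
    """Single-seeder optimum under limited fanout: (1 - 1/c) * min(1, r c / u)."""
    return (1 - Fraction(1, c)) * min(Fraction(1), Fraction(r) * c / Fraction(u))

r = Fraction(1)                  # streamrate normalized to 1
u1, c1 = Fraction(3, 2), 2       # seeder s1
u2, c2 = Fraction(1), 3          # seeder s2

eta1 = limited_fanout_efficiency(u1, c1, r)
eta2 = limited_fanout_efficiency(u2, c2, r)
weighted = (eta1 * u1 + eta2 * u2) / (u1 + u2)

print(eta1, eta2, weighted)       # individual optima and their weighted average
print(Fraction(8, 15) < weighted) # the joint optimum quoted in the text is strictly lower
```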
The good news is that for specific scenarios, we can have $\eta_{opt}(X) =
\frac{\sum_{s \in X} \eta_{opt}(s) u_s}{U_X}$. This is for instance the case when $X$ is proportionally
homogeneous.

Theorem 3. Consider a set $X \subset S$ that is proportionally homogeneous, i.e.
there is a rate $e$ so that $u_s = e c_s$ for all $s \in X$. Then, for $N_X \le \left\lfloor \frac{N_L - 1}{\max_{s \in X}(c_s) - 1} \right\rfloor \left\lfloor \frac{r}{e} \right\rfloor$,

$$\eta_{opt}(X) = \frac{\sum_{s \in X} \eta_{opt}(s) u_s}{U_X} = 1 - \frac{N_X}{\sum_{s \in X} c_s}. \tag{16}$$

Note that although we did not specify $e \le r$, it is an implicit condition:
otherwise, the result only applies for $N_X \le 0$, or in other words, the empty set.

Corollary 1. If all seeders in $X$ have the same upload $u$ and maximal fanout $c$,
and if $N_X \le \left\lfloor \frac{N_L - 1}{c - 1} \right\rfloor \left\lfloor \frac{rc}{u} \right\rfloor$, then

$$\eta_{opt}(X) = 1 - \frac{1}{c}. \tag{17}$$

Remark: In the homogeneous case, if we neglect truncation effects, the
condition of Corollary 1 corresponds to $U_X \le (N_L - 1) r \frac{c}{c-1}$. As $(N_L - 1)\frac{c}{c-1} \ge N_L$
(because $c \le N_L$), we get the sufficient condition $U_X \le r N_L$. Therefore,
Corollary 1 can be interpreted as follows: in the homogeneous limited fanout model,
up to truncation effects, efficiencies can be aggregated without loss for any
non-overprovisioned subset $X$.

Proof. Given Equations (11) and (15), we just need to give a diffusion scheme
$d$ such that $\eta_d(X) = \frac{\sum_{s \in X} \eta_{opt}(s) u_s}{U_X}$.

That diffusion scheme is the following: the streamrate $r$ is divided into $\lfloor \frac{r}{e} \rfloor$
distinct substreams of rate $e$. We then build up to $\lfloor \frac{r}{e} \rfloor$ trees such that: each
seeder $s$ in $X$ is an internal node for exactly one tree, having exactly $c_s = \frac{u_s}{e}$
children; the leaves are taken among the leechers; a leecher can belong to several
trees, but is contained at most once per tree.

A given tree can have up to $N_L$ leaves, but no more. We deduce that one
tree can contain $\left\lfloor \frac{N_L - 1}{\max_{s \in X}(c_s) - 1} \right\rfloor$ seeders, because a tree with $k$ internal nodes
(from $X$) has at most $k(\max_{s \in X}(c_s) - 1) + 1$ leaves. So the rules of the scheme
can be respected if $N_X \le \left\lfloor \frac{N_L - 1}{\max_{s \in X}(c_s) - 1} \right\rfloor \left\lfloor \frac{r}{e} \right\rfloor$.

In the corresponding diffusion scheme, where each tree is used to transmit
one of the $\lfloor \frac{r}{e} \rfloor$ distinct substreams of rate $e$, we verify that each seeder $s$ works
at optimal efficiency $1 - \frac{1}{c_s}$. Equation (8) concludes the theorem. The corollary
is just a special case where $e = \frac{u}{c}$ and $\max_{s \in X}(c_s) = c$.

□

Remark: We can see in the proof that the bound on $N_X$ is actually related to
the number of seeders that can fit in a tree with the constraints that each seeder
$s$ is an internal node with exactly $c_s$ children and there are no more than $N_L$
leaves. The bound we gave is very conservative, because it assumes $\max_{s \in X}(c_s)$
children for all seeders. It may not be tight, especially if $c_s$ spans a wide range.
However, finding out the optimal number of seeders that can collaborate at
optimal efficiency is difficult, as it is equivalent to solving a multiple knapsack
problem.
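The conservative bound on the number of cooperating seeders is easy to evaluate. The Python sketch below assumes the bound has the form read from the tree construction of the proof, namely (number of substreams of rate $e$) times (number of seeders that fit in one tree); the function names and the numerical setting are illustrative assumptions.

```python
import math

def max_seeders_proportional(N_L, r, e, c_max):
    """Conservative bound on N_X: floor(r / e) trees, each holding
    floor((N_L - 1) / (c_max - 1)) seeders (requires c_max >= 2).
    """
    trees = math.floor(r / e)
    seeders_per_tree = (N_L - 1) // (c_max - 1)
    return trees * seeders_per_tree

def aggregated_efficiency(fanouts):
    """Weighted average of the individual optima 1 - 1/c_s when u_s = e * c_s."""
    return 1 - len(fanouts) / sum(fanouts)

# 100 leechers, r = 100 KBytes/s, substreams of rate e = 25 KBytes/s, fanouts up to 4.
print(max_seeders_proportional(N_L=100, r=100, e=25, c_max=4))
# Three seeders with fanouts 4, 4 and 2 (uploads 100, 100 and 50 KBytes/s).
print(aggregated_efficiency([4, 4, 2]))
```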

6 Explicit overhead 

From now on, we will focus on the explicit linear overhead model, with
proportional cost $a$ and additive cost $b$. Under this model, the bandwidth required for
sending one copy of the stream through a single connection is $R := (1 + a)r + b$.
One easily checks that $\eta_{max} := \frac{r}{R}$ is the maximal efficiency achievable in our
model for any peer (leecher or seeder).

When illustrating our results with numerical examples, we consider a live
streaming system with $r = 100$ KBytes/s, a proportional overhead of 10% ($a =
0.1$), and two possible additive costs, small ($b = 1.7$ KBytes/s) and large ($b = 25$
KBytes/s). In the figures, we use the relative efficiency $\eta / \eta_{max}$ instead of $\eta$, in
order to facilitate the comparison between the two overhead settings.
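For reference, the two numerical settings can be evaluated with a few lines of Python. This sketch assumes that the maximal efficiency is the ratio $r / R$ (goodput of one full copy of the stream over the bandwidth it consumes); the function names are hypothetical.

```python
def stream_cost(r, a, b):
    """Bandwidth R needed to send one full copy of the stream over one connection."""
    return (1 + a) * r + b

def eta_max(r, a, b):
    """Upper bound r / R on the efficiency of any peer under linear overhead."""
    return r / stream_cost(r, a, b)

r, a = 100.0, 0.1  # KBytes/s, proportional overhead of 10%
for b in (1.7, 25.0):
    print(f"b = {b}: R = {stream_cost(r, a, b):.1f} KBytes/s, eta_max = {eta_max(r, a, b):.3f}")
```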

6.1 Efficiency of a single seeder: main theorem 

The following theorem gives the optimal efficiency of one single seeder when the 
overhead is linear. 

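Theorem 4. If the overhead follows a linear function, then the optimal
efficiency of a seeder $s$ is $\eta(s) = \frac{(N_L - 1) r}{u_s}$ if $u_s \ge N_L R$. If $u_s < N_L R$, then we have

$$\eta(s) = \begin{cases} 0 & \text{if } 0 < u_s < 2b, \\ \frac{\left(1 - \sqrt{b/u_s}\right)^2}{1+a} - \epsilon_1(u_s) & \text{if } 2b \le u_s \le \frac{R^2}{b}, \\ \frac{r}{R} - \frac{r}{u_s} - \epsilon_2(u_s) & \text{if } u_s > \frac{R^2}{b}, \end{cases} \tag{18}$$

with

$$0 \le \epsilon_1(u_s) \le \frac{1}{1+a}\left(\frac{b}{u_s}\right)^{3/2}, \qquad 0 \le \epsilon_2(u_s) \le \frac{b}{(1+a)u_s} \le \frac{1}{1+a}\left(\frac{b}{R}\right)^2.$$

Proof. The easy part of the proof is for $u_s \ge N_L R$. This corresponds to an
overprovisioned situation where $s$ alone can provide the live content to all leechers.
This is the optimal scheme for $s$, so it is straightforward that $\eta(s) = \frac{(N_L - 1) r}{u_s}$.

Equation (18), which corresponds to the case $u_s < N_L R$, can be proved in
three steps: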



RR n° 7608 



Live Seeding 






• Finding the maximal efficiency for a given bandwidth u and fanout c; 

• Maximizing the corresponding equations for a continuous c; 

• Bounding the gap induced by the fact that c has to be an integer. 

6.1.1 Maximizing $\eta$ for given $u$, $c$

We first notice that, to achieve maximal efficiency, all output rates have to
be equal to the input rate: if this is not the case in a given scheme, replacing
all output rates by their average value makes it possible to reduce the input rate to that
average value (it had to be at least the maximal output in the original
case), increasing efficiency. Therefore the optimal efficiency must be of the form
$\eta(s) = \frac{(c-1)e}{u}$ for some rate $0 < e \le r$. It is then clear that one should
choose the highest value of $e$ that is feasible.

Note that if $c = 1$, the seeder can only replicate its input and has null
efficiency; the seeder needs to maintain at least 2 connections with spare
bandwidth to have a non-null efficiency. This settles that $\eta = 0$ for $u < 2b$.
Otherwise, two cases are to be considered:

• if $c$ is the bottleneck (this happens for $u \ge Rc$), then $s$ has enough
bandwidth to broadcast the whole stream $r$ to $c$ targets, achieving efficiency
$\frac{(c-1)r}{u}$;

• if $u$ is the bottleneck (for $u < Rc$), then the optimal input rate $e$ is the solution
of $c((1 + a)e + b) = u$, leading to $e = \frac{\frac{u}{c} - b}{1+a}$. The corresponding efficiency is

$$\frac{(c-1)e}{u} = \frac{(c-1)\left(\frac{u}{c} - b\right)}{(1+a)u} = \frac{\left(1 - \frac{1}{c}\right) - \frac{b}{u}(c-1)}{1+a}.$$

For $u < R N_L$, the bottleneck is necessarily one of the above, so we deduce
that the optimal efficiency for given $u$ and $c$ is

$$\eta(u, c) = \min\left(\frac{(c-1)r}{u}, \frac{\left(1 - \frac{1}{c}\right) - \frac{b}{u}(c-1)}{1+a}\right). \tag{19}$$
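The reasoning of this step can be replayed numerically by computing the best feasible rate $e$ directly, rather than through the closed form. The Python sketch below is an illustration under the linear overhead model; the function names are hypothetical, and rates are clamped at zero when the additive cost alone exceeds the bandwidth.

```python
def best_rate(u, c, r, a, b):
    """Highest per-connection goodput e such that c connections fit in u, capped by r."""
    return max(0.0, min(r, (u / c - b) / (1 + a)))

def efficiency(u, c, r, a, b):
    """Efficiency (c - 1) e / u of a seeder receiving rate e once and sending it c times."""
    return (c - 1) * best_rate(u, c, r, a, b) / u

r, a, b = 100.0, 0.1, 25.0   # numerical setting with the large additive cost
u = 400.0
for c in range(1, 8):
    print(c, round(efficiency(u, c, r, a, b), 4))
```

With these values the efficiency peaks at $c = 4$, and a single connection ($c = 1$) yields a null efficiency, as stated in the text.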

6.1.2 Maximizing $\eta$ for given $u$

We now see (19) as a function of $c$ and try to find its maximal value. We propose
to first solve the problem in $\mathbb{R}$ before considering integers.
We introduce

$$\eta_1(c) := \frac{(c-1)r}{u} \quad \text{and} \quad \eta_2(c) := \frac{\left(1 - \frac{1}{c}\right) - \frac{b}{u}(c-1)}{1+a}.$$

The two functions have the following properties:

• $\eta_1$ is always increasing, and positive for $c > 1$;

• $\eta_2$ goes to $-\infty$ for $c$ going to $0$ and $+\infty$. It has a unique maximum,
$\frac{\left(1 - \sqrt{b/u}\right)^2}{1+a}$, which is reached for $c = \sqrt{u/b}$;

• $\eta_1 = \eta_2$ for $c = 1$ (corresponding efficiency is 0) and $c = \frac{u}{R}$ (corresponding
efficiency is $\frac{r}{R} - \frac{r}{u}$).

We deduce that the optimal efficiency for given $c$, $\eta = \min(\eta_1, \eta_2)$, is equal
to $\eta_1$ for $1 \le c \le \frac{u}{R}$ and $\eta_2$ for $c \ge \frac{u}{R}$. Two cases are then to be considered:

• if $\sqrt{u/b} \le \frac{u}{R}$ (that is $u \ge \frac{R^2}{b}$), then $\eta$ is increasing for $1 \le c \le \frac{u}{R}$ and decreasing
for $c \ge \frac{u}{R}$. The maximal efficiency is therefore $\frac{r}{R} - \frac{r}{u}$, reached for $c = \frac{u}{R}$;

• if $\sqrt{u/b} > \frac{u}{R}$ (that is $u < \frac{R^2}{b}$), then the maximal efficiency is the one of $\eta_2$,
$\frac{\left(1 - \sqrt{b/u}\right)^2}{1+a}$, reached for $c = \sqrt{u/b}$.
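6.1.3 Bounding the quantization gap

While the optimal value $c_{opt}$ we found is a real number, only integer values are
eligible. However, as the function $\eta = \min(\eta_1, \eta_2)$ always admits a unique
maximum, the effective optimal efficiency $\eta(s)$ is necessarily $\max(\eta(\lfloor c_{opt} \rfloor), \eta(\lceil c_{opt} \rceil))$.
In particular, we have $\eta(c_{opt} + 1) \le \eta(s) \le \eta(c_{opt})$, from which we deduce

$$\eta(s) = \eta(c_{opt}) - \epsilon, \quad \text{with } 0 \le \epsilon \le \eta(c_{opt}) - \eta(c_{opt} + 1).$$

From there, noticing that $\eta = \eta_2$ for $c \ge c_{opt}$, we get

$$\eta(c_{opt}) - \eta(c_{opt} + 1) = \frac{\frac{b}{u} - \frac{1}{c_{opt}(c_{opt} + 1)}}{1+a}.$$

• if $u < \frac{R^2}{b}$, then $c_{opt} = \sqrt{u/b}$, so we get

$$\eta(c_{opt}) - \eta(c_{opt} + 1) = \frac{\left(\frac{b}{u}\right)^{3/2}}{\left(1 + \sqrt{b/u}\right)(1+a)} \le \frac{\left(\frac{b}{u}\right)^{3/2}}{1+a};$$

• if $u \ge \frac{R^2}{b}$, we just use

$$\eta(c_{opt}) - \eta(c_{opt} + 1) \le \frac{b}{u(1+a)},$$

and note that $\frac{b}{u} \le \left(\frac{b}{R}\right)^2$. This concludes the proof.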



□

6.2 Efficiency of a single seeder: discussion

Following Theorem 4 and its proof, the following remarks can be made.

Figure 2: Validity of the continuous approximation of the optimal efficiency

6.2.1 Closed formulas approximation

The $\epsilon_1$ and $\epsilon_2$ terms are negligible as long as $u_s$ is big enough compared to the
additive cost $b$, so in most cases, one can safely use the continuous optimum
$\eta(c_{opt})$ (step 2 of the proof) instead of the discrete one $\max(\eta(\lfloor c_{opt} \rfloor), \eta(\lceil c_{opt} \rceil))$.
In other words,

$$\eta(s) \approx \begin{cases} 0 & \text{if } 0 < u_s < 2b, \\ \frac{\left(1 - \sqrt{b/u_s}\right)^2}{1+a} & \text{if } 2b \le u_s \le \frac{R^2}{b}, \\ \frac{r}{R} - \frac{r}{u_s} & \text{if } u_s > \frac{R^2}{b}. \end{cases} \tag{20}$$

To illustrate the validity of this approximation, Figure 2 compares it to the
exact efficiency for the two numerical settings we proposed at the beginning
of this section. We can see that the difference is barely noticeable for a large
additive overhead, and invisible for a small one.
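The size of the gap between the continuous and the discrete optimum can also be checked numerically. The Python sketch below is a hedged illustration: it assumes the closed forms derived in the proof (the minimum of the two per-fanout bounds, and the continuous optimum in the two regimes separated by $R^2 / b$), and it finds the discrete optimum by exhaustive search over integer fanouts. Function names and sampled bandwidths are arbitrary choices for the example.

```python
import math

def eta_uc(u, c, r, a, b):
    """Best efficiency for bandwidth u and integer fanout c (minimum of the two bounds)."""
    eta1 = (c - 1) * r / u
    eta2 = ((1 - 1 / c) - (b / u) * (c - 1)) / (1 + a)
    return min(eta1, eta2)

def continuous_optimum(u, r, a, b):
    """Continuous relaxation, for 2b <= u < N_L * R and b > 0."""
    R = (1 + a) * r + b
    if u <= R * R / b:
        return (1 - math.sqrt(b / u)) ** 2 / (1 + a)
    return r / R - r / u

def discrete_optimum(u, r, a, b, n_l):
    """Exhaustive search over the integer fanouts 1 .. min(N_L, floor(u / b))."""
    c_max = min(n_l, int(u // b))
    return max((eta_uc(u, c, r, a, b) for c in range(1, c_max + 1)), default=0.0)

r, a, n_l = 100.0, 0.1, 100
for b in (1.7, 25.0):
    for u in (200.0, 500.0, 2000.0):
        gap = continuous_optimum(u, r, a, b) - discrete_optimum(u, r, a, b, n_l)
        print(f"b = {b}, u = {u}: gap = {gap:.5f}")
```

On these samples the gap stays small compared to the efficiency itself, and it is smaller for the small additive cost than for the large one.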

6.2.2 Low/medium bandwidth 

The case $u_s \le \frac{R^2}{b}$ can be interpreted as follows: the upload bandwidth is no more than $\frac{R}{b}$
times the rate $R$. In most practical situations, one would expect $b \ll R$, so most
seeders would probably fall in this case, which corresponds to low, medium and
reasonably high bandwidths.

Within this range, it is interesting to note that both the optimal number
of connections and the corresponding efficiency are independent of $r$. Moreover, one
can note that the number of connections, $\sqrt{u_s/b}$, is quite similar to the empirical
formula used in the current BitTorrent mainline client, $\sqrt{0.6u}$ [3]. This makes
us think that the results given here could be adapted to scenarios other than
live seeding (this would need to be investigated in future work). The
0.6 factor would correspond to an additive connection cost $b \approx 1.7$ KBytes/s,
which explains why we use this value as one of our numerical settings (the other
value, $b = 25$ KBytes/s, is totally arbitrary).

6.2.3 (Very) high bandwidth 

For very high bandwidths (corresponding for instance to seeders managed by
some provider), the efficiency tends to $\eta_{max}$ as $u_s$ goes to infinity (under the
assumption that the scenario is not overprovisioned, i.e. $\frac{u_s}{N_L} < R$): super-seeders
can asymptotically reach the best achievable efficiency given the overhead
constraints.

6.2.4 Importance of input shaping 

Seeders do not need to get the whole streamrate. This makes it possible to adjust
their input rate as desired, which is key to achieving optimal efficiency.

For instance, under the assumption that the input rate of a seeder s is r, 
one easily checks that its best achievable efficiency is 




Figure 3: Impact of a badly shaped input rate (horizontal axis: upload bandwidth $u$, in KBytes/s)

Figure 4: General overhead model vs simple overhead model



Figure 3 gives a graphical comparison of $\eta_{opt}$ and $\eta_r$. While seeders with
optimized input rates can get a decent efficiency starting from a few $b$'s of upload
bandwidth, if the input is $r$, seeders with an upload bandwidth less than $R$ are
totally inefficient (they cannot give more than they receive, so the best choice
is not to use them). We also notice that the difference remains significant even
for higher upload bandwidths, especially if the additive overhead is small.



6.2.5 About receiver-side overhead 

In our model, we made the assumption that the burden of the overhead was
only on the sender. A more general model would consist in assuming that, in
addition to the sender overhead of parameters $(a, b)$, there is a receiver
overhead of parameters $(a_r, b_r)$ (if $p$ receives a streamrate $r_{q,p}$
from $q$, it has to use an upload bandwidth $a_r r_{q,p} + b_r$).

Theorem 4 and its proof can be adapted to the general model, at the price of
increased complexity. For instance, in the medium range scenario
($2b < u_s < \frac{R^2}{b}$), we have an optimal (continuous) number of
connections

$$c_{opt} = \sqrt{\frac{u_s}{b}}. \qquad (22)$$

In the general model, this would become

$$c_{opt} = \frac{-a_r b + \sqrt{b(a + a_r + 1)(u_s - b_r - a b_r + a_r b + a u_s)}}{b(a + 1)}. \qquad (23)$$

We see that the formulas get much more complex in the general model. However,
if one compares the practical values given by (22) and (23), we see that the
general behavior remains practically the same. This is depicted in Figure 4
(receiver overhead is assumed to be the same as the sender overhead, i.e.
$a_r := a$ and $b_r := b$).
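
The comparison between the two models can be reproduced numerically. The sketch below is only an illustration: it implements the closed forms of the simple model ($\sqrt{u_s/b}$) and of the general model (sender overhead $(a, b)$ plus receiver overhead $(a_r, b_r)$), with hypothetical function names and an assumed value $a = 0.05$.

```python
import math

def c_opt_simple(u, b):
    """Optimal continuous number of connections, sender-side overhead only."""
    return math.sqrt(u / b)

def c_opt_general(u, a, b, a_r, b_r):
    """Optimal continuous number of connections with receiver overhead (a_r, b_r)."""
    root = math.sqrt(b * (a + a_r + 1) * (u - b_r - a * b_r + a_r * b + a * u))
    return (-a_r * b + root) / (b * (a + 1))

# Hypothetical settings (KBytes/s); receiver overhead equal to sender overhead.
a, b = 0.05, 1.7
for u in (20, 100, 500):
    print(u, round(c_opt_simple(u, b), 2), round(c_opt_general(u, a, b, a, b), 2))
```

Setting $a_r = b_r = 0$ in the general expression gives back $\sqrt{u_s/b}$, which is a convenient sanity check of the formula.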

As the added complexity does not seem to make much practical difference,
we choose to discard the receiver overhead in our model. However, the reason we
can do that is probably that the natural use of live seeders is to feed them
with a single input rate, which reduces the impact of receiver overhead. If we
want to extend our framework to leechers, which usually receive multiple
substreams from multiple sources, a proper modeling of the receiver overhead
may become mandatory.



6.3 Efficiency of a set of seeders 

As for the limited fanout model, there is no guarantee that the optimal
individual efficiencies of seeders can be aggregated in a common scheme. In the
following, we propose two heuristics that make it possible to adapt Theorem 3
to the overhead model: the mono-rate and dichotomic rates diffusion schemes.



6.3.1 Mono-rate scheme 

The idea of the mono-rate approach is quite simple: if a set of seeders agree
on a common substream rate $e$, they can behave as a proportionally
heterogeneous set. Their efficiency obeys the following theorem:

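Theorem 5. Consider a set $X \subset S$ that verifies:

• $\bar{u}_X \le \frac{2R^2}{b}$;

• $N_X$ satisfies the aggregation condition of Theorem 3, taken with
$c_s = \lfloor u_s / E \rfloor$, where $E := \sqrt{b \bar{u}_X / 2}$.

Then, if all seeders of $X$ agree on a common rate $e := \frac{E - b}{1 + a}$
used for all inputs and outputs, the efficiency $\eta_e(X)$ of the
corresponding scheme verifies

$$\frac{\left(1 - \sqrt{2b/\bar{u}_X}\right)^2}{1 + a} < \eta_e(X) \le \frac{\left(1 - \sqrt{b/\bar{u}_X}\right)^2}{1 + a}. \qquad (24)$$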

Proof. Consider a given rate $e \le r$. Call $E := (1 + a)e + b$ the
corresponding rate with overhead. The maximal efficiency of a seeder $s$ having
$e$ as input and outputs is reached when $s$ opens the maximal number of
outgoing connections allowing to stream $e$. This leads to

$$\eta_e(s) = \frac{\left(\lfloor u_s / E \rfloor - 1\right) e}{u_s}.$$

In particular,

$$\frac{e}{E} - \frac{2e}{u_s} < \eta_e(s) \le \frac{e}{E} - \frac{e}{u_s}.$$

Assume that the number of seeders in $X$ is small enough to allow perfect
aggregation of efficiencies, as for Theorem 3 (the corresponding condition will
be derived later). We then have
$\eta_e(X) = \frac{\sum_{s \in X} \eta_e(s) u_s}{U_X}$; therefore, with
$\bar{u}_X = U_X / N_X$,

$$\frac{e}{E} - \frac{2e}{\bar{u}_X} < \eta_e(X) \le \frac{e}{E} - \frac{e}{\bar{u}_X}.$$

The maximal value of $\frac{e}{E} - \frac{e}{\bar{u}_X}$ is
$\frac{(1 - \sqrt{b/\bar{u}_X})^2}{1 + a}$, proving the right part of (24). The
maximal value of $\frac{e}{E} - \frac{2e}{\bar{u}_X}$ is
$\frac{(1 - \sqrt{2b/\bar{u}_X})^2}{1 + a}$, and it is reached for
$E = \sqrt{b \bar{u}_X / 2}$. As we have $E \le R$, this implies
$\bar{u}_X \le \frac{2R^2}{b}$.

We then need to give a sufficient condition for aggregating the efficiencies
without losses. We can use the condition on $N_X$ from Theorem 3. Noticing
that $c_s = \lfloor u_s / E \rfloor$ allows us to conclude. □
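
The bounds of Theorem 5 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: perfect aggregation is taken for granted, $r$ is assumed large enough for the chosen rate $e$ to be admissible, and the set of upload bandwidths, the value $a = 0.05$ and the function names are hypothetical.

```python
import math

def mono_rate_efficiency(uploads, e, a, b):
    """Aggregated efficiency sum_s eta_e(s) * u_s / U_X (perfect aggregation assumed)."""
    E = (1 + a) * e + b  # rate with overhead
    # eta_e(s) * u_s = (floor(u_s / E) - 1) * e for each seeder
    gained = sum((math.floor(u / E) - 1) * e for u in uploads)
    return gained / sum(uploads)

def mono_rate_bounds(uploads, a, b):
    """Lower and upper bounds on the mono-rate efficiency, from the mean upload."""
    mean_u = sum(uploads) / len(uploads)
    lower = (1 - math.sqrt(2 * b / mean_u)) ** 2 / (1 + a)
    upper = (1 - math.sqrt(b / mean_u)) ** 2 / (1 + a)
    return lower, upper

# Hypothetical heterogeneous set (KBytes/s); a = 0.05 is an assumption.
a, b = 0.05, 1.7
uploads = [40, 60, 80, 120, 200]
mean_u = sum(uploads) / len(uploads)
E = math.sqrt(b * mean_u / 2)  # overhead-inclusive rate maximizing the lower bound
e = (E - b) / (1 + a)          # corresponding common substream rate
eta = mono_rate_efficiency(uploads, e, a, b)
lower, upper = mono_rate_bounds(uploads, a, b)
print(round(lower, 4), round(eta, 4), round(upper, 4))
```

The gap between the two bounds comes from the rounding $\lfloor u_s / E \rfloor$: each seeder may leave up to one connection's worth of bandwidth unused.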

6.3.2 Dichotomic scheme 

The dichotomic approach consists in the diffusion of several substreams whose
rates are divisors of $r$, instead of using a single rate $e$. In detail, the
predetermined substreams are:

• the video stream of rate $r$, which can be split into

• 2 non-overlapping substreams of rate $\frac{r}{2}$, each of which can be
split into 2 substreams

• ...

• $2^{k_{\max}}$ non-overlapping substreams of rate $\frac{r}{2^{k_{\max}}}$,
for some $k_{\max} > 0$.

$k$ is called the level of a substream of rate $\frac{r}{2^k}$.

A seeder $s$ is said to operate at level $k$ if it behaves as follows:

• it receives as input a level $k$ substream; let $l := k$ be its working
level;

• as long as $s$ has a residual upload bandwidth greater than $b$ and
$l \le k_{\max}$, do:

  - if there is not enough residual upload bandwidth to establish a new
    output of level $l$,

  - then $l := l + 1$ (a child substream of the current level $l$ substream
    is chosen),

  - else create a new output of level $l$.

The corresponding efficiency is denoted $\eta_k(s)$. In order to optimize the
dichotomic approach, each seeder operates at a level that maximizes its single
efficiency, i.e. chooses a level $k_s$ such that
$\eta_{k_s}(s) = \max_{0 \le k \le k_{\max}} \eta_k(s)$. The corresponding
efficiency is denoted $\eta_{Bin}(s)$.
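
The level-$k$ behavior and the choice of the best level can be sketched as follows. This is a minimal sketch under assumptions that should be stated clearly: an output of level $l$ is assumed to cost $(1 + a)\frac{r}{2^l} + b$ of upload bandwidth, the efficiency is taken as (total output rate minus input rate) over upload, the loop runs up to and including level $k_{\max}$, and $k_{\max} = \lfloor \log_2(r/b) \rfloor$ is used as an arbitrary choice. Function names and numerical values are hypothetical.

```python
import math

def level_efficiency(u, k, r, a, b, k_max):
    """Efficiency of a seeder of upload u operating at level k (greedy rule)."""
    residual = u   # upload bandwidth not yet committed
    level = k      # working level l
    output = 0.0   # total goodput sent
    while residual > b and level <= k_max:
        cost = (1 + a) * r / 2 ** level + b  # upload needed for one more output
        if residual < cost:
            level += 1                       # move to a child substream
        else:
            residual -= cost
            output += r / 2 ** level
    return (output - r / 2 ** k) / u         # (output - input) / upload

def dichotomic_efficiency(u, r, a, b):
    """Best efficiency over all operating levels 0..k_max (0 if none is useful)."""
    k_max = math.floor(math.log2(r / b))
    best = max(level_efficiency(u, k, r, a, b, k_max) for k in range(k_max + 1))
    return max(0.0, best)

# Hypothetical settings (KBytes/s): r = 100, a = 0.05, b = 1.7.
r, a, b = 100, 0.05, 1.7
for u in (20, 100, 400):
    eta_opt = (1 - math.sqrt(b / u)) ** 2 / (1 + a)  # single-rate optimum
    print(u, round(dichotomic_efficiency(u, r, a, b), 4), round(eta_opt, 4))
```

Since a seeder only needs its own upload bandwidth to run this computation, the operating level can be chosen locally.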

As the operating rate is necessarily a divisor of $r$, $\eta_{Bin}(s)$ is
necessarily suboptimal. However, the different levels allow enough freedom to
get an efficiency close to the optimal one. For instance, Figure 5 gives a
graphical comparison of $\eta_{Bin}(s)$ and $\eta_{opt}(s)$, using
$k_{\max} = \lfloor \log_2(\frac{r}{b}) \rfloor$ (this is an arbitrary choice
that corresponds to stopping the subdivision when substreams need more overhead
than their actual goodput). One observes that the individual efficiency loss is
quite acceptable, especially for a low additive overhead.

Figure 5: Dichotomic vs optimal individual efficiencies

For a given set $X$ of seeders, the construction of a dichotomic diffusion
scheme is rather simple:

• all seeders operating at level $k$ organize to achieve up to $2^k$ diffusion
trees for level $k$; each seeder tries to join the level $k$ diffusion tree
that currently has the fewest leaves;

• if a level $k$ seeder has outputs of level $k' > k$, they can either be
directly transmitted to leechers or serve as root for a level $k'$ diffusion
tree;

• if some seeders at level $k$ miss the input streamrate to build their
diffusion scheme, they may use a leaf from a parent substream diffusion tree
(some of the parent rate will be wasted).

Under some conditions, we can evaluate the efficiency of X under a di- 
chotomic diffusion. 

Theorem 6. If, for a given set $X \subset S$, we have $U_X \le N_L R$, and if
all non-empty diffusion trees can be rooted with proper input, then the
efficiency $\eta_{Bin}(X)$ of $X$ under a dichotomic diffusion verifies

$$\frac{\sum_{s \in X} \eta_{Bin}(s) u_s}{U_X} - \frac{r k_{\max}}{U_X} \le \eta_{Bin}(X) \le \frac{\sum_{s \in X} \eta_{Bin}(s) u_s}{U_X}. \qquad (25)$$

The interpretation is the following: up to a term $\frac{r k_{\max}}{U_X}$,
which is small if $U_X$ is big enough, the individual dichotomic efficiencies,
which are close to the optimal individual efficiencies, can be aggregated
without loss.
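
To make the size of the correction term concrete, the bounds of Theorem 6 can be evaluated on a small example. The sketch below only performs the arithmetic of the two bounds; the set of seeders and their individual efficiencies are made-up values, and the function name is hypothetical.

```python
def dichotomic_set_bounds(seeders, r, k_max):
    """Return the (lower, upper) bounds on the efficiency of a set X.

    seeders: list of (u_s, eta_bin_s) pairs, one per seeder of X.
    """
    total_upload = sum(u for u, _ in seeders)  # U_X
    aggregated = sum(u * eta for u, eta in seeders) / total_upload
    return aggregated - r * k_max / total_upload, aggregated

# Hypothetical set: uploads in KBytes/s with made-up individual efficiencies.
seeders = [(100, 0.69), (200, 0.75), (400, 0.80), (800, 0.85)]
lower, upper = dichotomic_set_bounds(seeders, r=100, k_max=5)
print(round(lower, 3), round(upper, 3))
```

With only four seeders the term $\frac{r k_{\max}}{U_X}$ is still large; it shrinks as the total upload bandwidth of the set grows.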

Proof. The condition $U_X \le N_L R$ ensures that no diffusion tree has more
leaves than there are leechers in need of the corresponding substream. This can
be shown by induction:

• at level 0, the diffusion tree cannot have more than $\frac{U_X}{R}$ leaves,
which is smaller than $N_L$;

• at level $k$, let $U_k$ denote the bandwidth that remains after the
bandwidth consumed by lower levels is subtracted; let $N_k$ be the maximal
number of leechers that can be leaves at that level (a given leecher is counted
with multiplicity equal to the number of level $k$ substreams it needs); let
$M_k$ be the number of leechers that get a level $k$ substream (with
multiplicity). Note the relation $N_k = 2(N_{k-1} - M_{k-1})$, i.e. the maximal
number at a given level is twice the number of slots that have not been filled
at the previous level. Assume that
$U_{k-1} \le N_{k-1}((1 + a)\frac{r}{2^{k-1}} + b)$, that is, at level $k - 1$,
the residual bandwidth is not overprovisioned compared to the number of
possible leaves. Then we have

$$\begin{aligned}
U_k &\le U_{k-1} - M_{k-1}\left((1 + a)\frac{r}{2^{k-1}} + b\right) \\
    &\le (N_{k-1} - M_{k-1})\left((1 + a)\frac{r}{2^{k-1}} + b\right) \\
    &\le N_k\left((1 + a)\frac{r}{2^k} + \frac{b}{2}\right) \le N_k\left((1 + a)\frac{r}{2^k} + b\right).
\end{aligned}$$

So at any given level, a diffusion tree can always find a leecher to give its
output to. Therefore the only waste compared with individual efficiencies
occurs when the root input of a tree comes from a parent substream. This is
bounded by $r$ when considering all roots at a given level $k > 0$, leading to
a total waste bounded by $r k_{\max}$. Normalizing by $U_X$ concludes the
proof. □

6.3.3 Comparison of the two methods 

The mono-rate approach is simple to describe, which makes it a good proof
of concept for using multiple seeders in a system with overhead. However, the
dichotomic approach, although more complex, has many advantages over the
mono-rate approach that make it more suitable for practical use.

Firstly, the substreams are pre-determined, while mono-rate requires
determining the proper input rate $e$, which depends on $\bar{u}_X$. Among
other things, this considerably facilitates the interaction with the leechers'
diffusion process. Furthermore, under the dichotomic approach, a seeder $s$ can
determine its operating level by itself (it is just a function of $u_s$) while
in the mono-rate approach, knowing $\bar{u}_X$ implies some knowledge of the
whole set $X$. This is even worse when considering dynamics in $X$: a change in
$e = f(\bar{u}_X)$ requires a complete reorganization of the diffusion trees in
the mono-rate approach, while changes are expected to be mostly local in the
dichotomic approach.

Also note that as streamrates are divisors of $r$, the quantization effect
$\lfloor \cdot \rfloor$ that may limit the mono-rate approach (cf. Theorem 5)
has no equivalent in the dichotomic approach.

Finally, the mono-rate approach can force a lot of seeders to use an input
rate that is far from the single seeder optimal. This impact is bounded (cf.
Theorem 5), but can be non-negligible, especially if the seeders' bandwidths
are highly heterogeneous. In contrast, the dichotomic approach adjusts for each
seeder $s$ a level $k_s$ such that the input rate is not too far from the
optimal.

7 Discussion

7.1 Leecher diffusion process 

We did not consider in detail the way to make the diffusion processes of
leechers and seeders work together. This is a problem in itself, which deserves
a separate study. The study performed in [5] seems to be adaptable to the case
with seeders, at least for the limited fanout model, but further work is
required to transpose the results to the overhead model (including keeping in
mind the existence of receiver-side overhead).

However, we argue that knowing how to optimize the diffusion process of 
seeders alone is not a bad starting point. 

7.2 Make a minimal use of seeders 

While this whole paper is devoted to making the best possible use of seeders,
we should recall that in the design of a real system, targeting the maximal
seeder efficiency is not necessarily the smartest thing to do.

In fact, seeders "waste" their input rate by design, which makes them
inherently less efficient than leechers. Therefore, one should use seeders as
minimally as possible. The proper way to use seeders is:

• try to achieve most of the content diffusion by using the servers and
leechers alone. If possible, the leechers should perform a lossless diffusion
of a common substream of rate $r' < r$ among all of them instead of a
partial or lossy diffusion of rate $r$;

• if $r' < r$, use seeders to finish the job. This is where the results of
this paper apply, which describe the best one can expect from seeders and how
to achieve it.

7.3 Application: dimensioning a scalable live streaming 
system 

Many dimensioning rules can be derived by using the formulas we proposed.
For instance, determining if the system is scalable would consist in checking
if $\eta(L)\alpha_L + \eta(S)\beta\alpha_S \ge 1$ [1]. If we assume here for
simplicity homogeneous bandwidth $u$, $\eta(S) = \eta_{opt}(u)$ (neglecting
aggregation issues), and optimal leechers' efficiency $\eta_L = \eta_{\max}$
(see footnote 3), one can derive the relationship that $u$ and $\beta$ must
verify for the system to be scalable:

$$\beta \eta_{opt}(u) \ge \frac{r}{u} - \eta_{\max}. \qquad (26)$$

If $\beta$, which indicates the ratio between idle (seeders) and active
(leechers) users, is a given parameter of the system, Equation (26) can be used
to derive the bandwidth $u$ that is required for the system to be scalable.
This is illustrated by Figure 6 (the performance of the perfect system, i.e.
$a = b = 0$, is also plotted for comparison). Notice how even small values of
$\beta$ (less than 1) can give a significant decrease of the required
bandwidth, which is $R$ for a seedless system with perfectly efficient
leechers.

Figure 6: Average bandwidth required for scalability

Footnote 3: The efficiency of leechers should take into account the number of
outgoing connections like we did for the seeders. However, $\eta_L$ is not the
main matter of this paper, so we assume without remorse perfect efficiency
$\eta_{\max}$.
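
The required bandwidth can be obtained numerically from the scalability condition $\beta \eta_{opt}(u) \ge \frac{r}{u} - \eta_{\max}$. The following sketch is one possible way to do it, by bisection. It relies on assumptions that are consistent with the text but should be read as such: $R = (1 + a)r + b$, $\eta_{\max} = r/R$, and $\eta_{opt}(u) = (1 - \sqrt{b/u})^2/(1 + a)$ (the medium-range expression, taken as 0 for $u \le b$). Function names and numerical values are hypothetical.

```python
import math

def eta_opt(u, a, b):
    """Medium-range seeder efficiency (1 - sqrt(b/u))^2 / (1 + a); 0 if u <= b."""
    return (1 - math.sqrt(b / u)) ** 2 / (1 + a) if u > b else 0.0

def required_bandwidth(beta, r, a, b, tol=1e-9):
    """Smallest u with beta * eta_opt(u) >= r / u - eta_max, by bisection on (0, R]."""
    R = (1 + a) * r + b
    eta_max = r / R

    def margin(u):
        # Increasing in u: non-negative exactly when the system is scalable.
        return beta * eta_opt(u, a, b) + eta_max - r / u

    lo, hi = tol, R  # margin(R) >= 0 and margin(tol) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if margin(mid) >= 0:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical settings (KBytes/s): r = 100, a = 0.05, b = 1.7.
for beta in (0.0, 0.25, 0.5, 1.0):
    print(beta, round(required_bandwidth(beta, 100, 0.05, 1.7), 2))
```

For $\beta = 0$ the routine returns $R$, which matches the seedless case mentioned above.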

7.4 About delays 

We have not taken delay issues into account. The diffusion delay is obviously
a major concern in the design of a live streaming system. However, it should be
noted that the two heuristics we proposed are based on diffusion trees.
Therefore the induced delay is at most equal to the delay of a single
connection times a logarithm of $N_L$. This is exactly the same type of delay
that is experienced for diffusion based on leechers only, so we argue that
using seeders should not impact the delay performance of a P2P live streaming
system.

8 Conclusion 

In this paper, we gave the keys to understand how seeders could be used in
P2P live streaming if servers and leechers do not suffice. After a preliminary
work on perfect and limited-fanout systems, we conducted our study on a model
with linear overhead. Although this is a preliminary study, with results that
are more theoretical than practical, we believe that the present work may have
a significant impact on the design and dimensioning of live streaming systems
using seeders.

In future work, we plan to pursue the matter of leechers/seeders interaction
in the general overhead model. We also think that the concept of live seeders
introduced here could be extended to a more general concept of half-seeders,
i.e. seeders with not all resources expected from a traditional seeder.
Studying half-seeders could make it possible to extend our results to all P2P
content distribution systems, including file-sharing and Video-on-Demand
systems.